Abstract

This work is devoted to the almost sure stabilization of adaptive control systems
that involve an unknown Markov chain. The control system displays continuous
dynamics represented by differential equations and discrete events given by a hidden
Markov chain. Different from previous work on stabilization of adaptive controlled
systems with a hidden Markov chain, where average criteria were considered, this work
focuses on the almost sure stabilization or sample path stabilization of the underlying
processes. Under simple conditions, it is shown that as long as the feedback controls
have linear growth in the continuous component, the resulting process is regular.
Moreover, by appropriate choice of the Lyapunov functions, it is shown that the adaptive
system is stabilizable almost surely. As a by-product, it is also established that the
controlled process is positive recurrent.

1 Introduction

This work deals with almost sure stabilization of adaptive control systems in continuous time
with an unknown parameter process that is a hidden Markov chain. The systems belong
to the class of partially observed control systems. Naturally, one estimates the parameter
process by using nonlinear filtering techniques and then uses the estimator in the systems
in order to design adaptive control strategies. The motivation of our study stems from
consideration of the following problem. Let us begin with a hybrid linear quadratic (LQ)
problem

\[ \dot{X}(t) = A_{\alpha(t)} X(t) + B_{\alpha(t)} U(t), \]

where $\alpha(t)$ is a continuous-time Markov chain taking values in a finite set $\mathcal{M} = \{1, \ldots, m\}$,
$A_i$ and $B_i$ for $i \in \mathcal{M}$ are matrices with compatible dimensions, and $U(t)$ is the control
process. One can observe that, different from the traditional setup of LQ problems, the system
matrices $A_{\alpha(t)}$ and $B_{\alpha(t)}$ are both subject to random switching influence. At any given instance,
these coefficient matrices are chosen from a finite number of candidates indexed by $\mathcal{M}$. The
selection rule is dictated by the modulating switching process $\alpha(t)$, which jumps from
one state to another at random times. Such systems have enjoyed numerous applications
in emerging application areas such as financial engineering and wireless communications, as well as
in existing applications. A particularly important problem concerns the asymptotic behavior
of such systems when they are in operation for a long time. Our interest lies in finding
admissible controls so that the resulting system will be almost surely stabilized. An added
difficulty is that the process $X(t)$ can only be observed with an additive noise

\[ dX(t) = [A_{\alpha(t)} X(t) + B_{\alpha(t)} U(t)]\,dt + dW(t). \]

For such partially observed systems, it is natural to use nonlinear filtering techniques. The
associated filter is known as the Wonham filter [17], which is one of a handful of finite
dimensional filters in existence.

Linear quadratic (LQ) regulators appear to present rather simple structures. Meanwhile, 
there are so many applications that can be described by such processes. We refer the reader 
to [H [21 [6l [1^ for some recent work on the associated control, estimation, and optimization 
problems for hybrid systems. Emerging applications have also been found in manufacturing
systems, in which a Markov chain is used to represent the capacity of an unreliable machine,
and in wireless communication, in which a Markov chain is used to depict randomly time-varying
signals or channels. In financial engineering, a geometric Brownian motion model for a stock
is frequently used. The traditional setup can be described by a linear stochastic differential
equation, where both the appreciation rate and volatility are constant. However, it has been
recognized that such a formulation is far from realistic. Very often, there is additional
randomness due to the variation of interest rates and other random environment factors.
For example, Markowitz's well-known mean-variance portfolio selection is one of the
LQ control problems. Some recent effort for mean-variance control problems has been on
obtaining optimal portfolio selections when both the appreciation rate and volatility depend
on a Markov chain. For all of the applications mentioned above, practical considerations
often lead one to deal with unobservable Markov chains. In many situations, the Markov chain is
used to model a random environment. Thus, treating adaptive controls, stability, and stabilization
of such systems will have significant impact on many applications.

There has been continued interest in dealing with hybrid systems under Markov switching.
In [16], stabilization for robust controls of jump LQ control problems was investigated.
In [6], both controllability and stabilizability of jump linear LQ systems were considered.
Stability under random perturbations of Markov chain type can be traced back to the work
[8]. This line of work has been substantially expanded to diffusion systems in [9], [11].
Recently, renewed interest has been shown in switching diffusions; see for example
[ini [H [ig [20] among others. 

In the literature, stabilization of continuous-time adaptive control systems with hidden
Markov chains was considered in [3], [5]. In both of these references, averaging criteria were
used for the purpose of stabilization. To be more precise, adaptive control strategies were
developed in [5] to make both the system and the control have bounded second moments in
the sense

\[ \limsup_{t \to \infty} E\big[|X(t)|^2 + |U(t)|^2\big] < \infty, \]

whereas adaptive controls were obtained in [3] to have the second moments of the averages
of both the system and control bounded in the sense

\[ \limsup_{t \to \infty} \frac{1}{t}\, E\Big[\int_0^t \big(|X(s)|^2 + |U(s)|^2\big)\,ds\Big] < \infty. \]



In comparison with the aforementioned references, it is a worthwhile effort to examine
the pathwise stabilization of the associated LQ problems under partial observations. First,
to be of any practical use in applications, the system resulting from an adaptive control
law should not allow wild behavior in the sample paths. Secondly, owing to the use of
adaptive control strategies, known results in stability and stabilization in Markov-modulated
stochastic systems cannot be applied directly. As will be seen in later sections, the feedback
adaptive controls make the underlying systems difficult to analyze. Certain functions
associated with the diffusion matrix in fact grow faster than is normally allowed in the
standard analysis. When averaged criteria are used, this kind of difficulty does not show up
since by taking expectation, we can easily average out the Brownian motion term. However,
when pathwise criteria are used, we can no longer use arguments based on
expectations. Thus the consideration of pathwise stabilization is both practically necessary
and theoretically interesting.

To begin our quest of finding admissible controls that stabilize the systems almost surely,
we first address the question of whether the controlled process is regular. By a process being regular
we mean that, with probability one, it does not have a finite explosion time. We establish
regularity under linear growth conditions on the feedback controls.
Then, we develop sufficient conditions and admissible adaptive controls under which the
system is stabilizable. Moreover, as a by-product, we also establish positive recurrence of
the underlying processes as a corollary of our stabilization result. For a deterministic system
given by a differential equation, if the solutions are ultimately uniformly bounded, then it is
Lagrange stable. For stochastic systems, almost sure boundedness excludes many cases (e.g.,
any system perturbed by a white noise). Thus, in lieu of such boundedness, one seeks
stability in a certain weak sense. A process is recurrent if, starting from a point outside a
compact set, it returns to that set with probability one. We say the
process is positive recurrent if the expected return time is finite. In fact, positive recurrence
is termed weak stability for diffusion processes in [18]. For a practical system, the absence of a finite
explosion time is a must. In addition, starting from a point outside of a bounded set, the
system should be able to return to the set infinitely often with probability one. Moreover,
the average return time cannot be infinitely long; otherwise the controlled system is useless.
Thus, regularity and recurrence of adaptive control systems can be viewed as "practical"
stability conditions.

The rest of the paper is organised as follows. Section 2 presents the formulation and
preliminaries. Section 3 investigates the regularity of the underlying process. Our conclusion
is that, as long as the feedback controls have linear growth, the resulting systems will be
regular. Section 4 proceeds with the study of stabilization. We conclude the paper with
some additional remarks in Section 5. In order to preserve the flow of presentation, proofs
of a couple of technical results are postponed to two appendices.

2 Formulation and Preliminary 
2.1 Problem Setup 

Denote by $(\Omega, \mathcal{A}, P)$ a probability space with an associated nondecreasing family of $\sigma$-algebras
$(\mathcal{F}_t)$. Let $\alpha(t)$ be a continuous-time Markov chain with a finite state space $\mathcal{M} = \{1, \ldots, m\}$
and transition rate matrix $\Pi = (\pi_{ij}) \in \mathbb{R}^{m \times m}$, and $W(t)$ be a standard $\mathbb{R}^n$-valued Brownian
motion. In the above and hereafter, $A'$ denotes the transpose of a matrix $A$, $|A| = \sqrt{\mathrm{tr}(AA')}$
is the trace norm of $A$, and $|v| = \sqrt{v'v}$ is the usual Euclidean norm of a vector $v$.

Assume throughout the paper that $W(t)$ and $\alpha(t)$ are independent. Let $X(t) \in \mathbb{R}^n$ and
$U(t) \in \mathbb{R}^d$ be the state and control processes, respectively. For $i \in \mathcal{M}$, $A_i \in \mathbb{R}^{n \times n}$ and
$B_i \in \mathbb{R}^{n \times d}$ are matrices with appropriate dimensions. Our main interest focuses on the
following regime-switching stochastic system

\[ dX(t) = [A_{\alpha(t)} X(t) + B_{\alpha(t)} U(t)]\,dt + dW(t) \tag{2.1} \]

with square integrable initial condition $X(0) = x$. As in [3], [5], denoting the column vector in
$\mathbb{R}^m$ of indicator functions by

\[ \Phi(t) = \big(I_{\{\alpha(t)=1\}}, \ldots, I_{\{\alpha(t)=m\}}\big)', \]

where $I_E$ stands for the usual indicator function of the event $E$, we may present the dynamics
of the Markov chain by

\[ d\Phi(t) = \Pi'\Phi(t)\,dt + dM(t). \]
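To make the indicator representation concrete, the following minimal sketch simulates a finite-state chain from a given rate matrix and builds the indicator vector. It assumes 0-based state labels and a two-state rate matrix chosen purely for illustration; the names `simulate_chain` and `indicator_vector` are hypothetical and do not come from the paper.

```python
import random


def simulate_chain(Pi, alpha0, horizon, rng):
    """Simulate a finite-state continuous-time Markov chain on [0, horizon].

    Pi is the transition rate matrix (rows sum to zero, off-diagonals >= 0).
    States are 0-based here. Returns a list of (jump_time, state) pairs.
    """
    t, state = 0.0, alpha0
    path = [(t, state)]
    while True:
        rate = -Pi[state][state]            # total exit rate of the current state
        if rate <= 0.0:                     # absorbing state: no further jumps
            break
        t += rng.expovariate(rate)          # exponential holding time
        if t >= horizon:
            break
        # Pick the next state j != state with probability Pi[state][j] / rate.
        u, acc, nxt = rng.random() * rate, 0.0, state
        for j, q in enumerate(Pi[state]):
            if j == state or q <= 0.0:
                continue
            acc += q
            nxt = j                         # last admissible state guards against rounding
            if u < acc:
                break
        state = nxt
        path.append((t, state))
    return path


def indicator_vector(state, m):
    """Column vector of indicators (1 at the current state, 0 elsewhere)."""
    return [1.0 if i == state else 0.0 for i in range(m)]
```

For instance, `simulate_chain([[-1.0, 1.0], [2.0, -2.0]], 0, 50.0, random.Random(0))` returns one sample path of a two-state chain, and `indicator_vector` evaluated along that path gives the piecewise constant vector process whose compensated increments form the martingale part.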

The process $M(t)$ is an $\mathbb{R}^m$-valued square integrable martingale with right continuous
trajectories. The independence of $\alpha(t)$ and $W(t)$ implies that of $\Phi(t)$ and $W(t)$. In all the sequel,
we also assume that $x$, $\Phi(t)$, and $W(t)$ are mutually independent. Consider the quadratic
cost criterion

\[ J_T(x, \alpha, U) = E_{x,\alpha}\Big[\int_0^T \big(X'(t)Q_{\alpha(t)}X(t) + U'(t)R_{\alpha(t)}U(t)\big)\,dt\Big], \]

where $E_{x,\alpha}$ denotes the expectation with initial conditions $X(0) = x$, $\alpha(0) = \alpha$, and for
each $i \in \mathcal{M}$, $Q_i$ is a symmetric positive semi-definite matrix, and $R_i$ is a symmetric positive
definite matrix.

One of the main features of the system considered here is that the Markov chain under
consideration is a hidden one. As treated in [3], [5], the essence is that we are dealing with
a system (2.1) with unknown mode that switches back and forth among a finite set at
random times. But different from previous consideration, we wish to establish the regularity
of the process and to find conditions ensuring almost sure stabilization. The almost sure
stabilization poses new challenges and difficulties since we cannot average out the martingale
term by means of taking expectations. Compared with the aforementioned papers, different
techniques are needed. Here the keystone is to find a suitable Lyapunov function.

Throughout the paper, the process $X(t)$ is assumed to be observable, but this is not the
case for the switching process $\alpha(t)$. The problem belongs to the category of controls with
partial observations. Observing $\alpha(t)$ through the adaptive control process with Gaussian
white noise brings us to the framework of the setup of Wonham filtering problems [17].
Denote by $\mathcal{F}^X_t$ the $\sigma$-algebra generated by the observations, $\mathcal{F}^X_t = \sigma\{X(s), s \le t\}$. For the problem of
interest, a control is said to be admissible if for each $t \ge 0$, $U(t)$ is $\mathcal{F}^X_t$-measurable. We are
now in a position to state precisely the problem we wish to study.

Problem statement. Under the setup presented so far, we aim to solve the following
problems.

1. We analyze (2.1) and obtain conditions under which the system will be regular. Hence,
our goal is to propose sufficient conditions ensuring the process will not have finite
explosion time. We show that, as long as the feedback control $U$ (as a function of $x$)
has linear growth in $x$, the resulting adaptive control system will be regular.

2. We design admissible adaptive controls and provide sufficient conditions that stabilize
the closed-loop system almost surely (a.s.). Loosely, the sufficient condition ensures
that for almost all sample points $\omega$ (except a null set), the corresponding system will
be stabilizable. The precise definition of almost sure stabilization will be provided in
Section 4.



2.2 Preliminary 

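As in [3], [5], we convert this partially observed system to a control process with complete
observation. This entails replacing the hidden state $\Phi(t)$ by its estimator, namely the
well-known Wonham filter $\hat\Phi(t)$. Using feedback control $U(t) = U(X(t), \hat\Phi(t))$, we shall need the
following notation

\[
\begin{aligned}
\hat\Phi_i(t) &= E\big[I_{\{\alpha(t)=i\}} \,\big|\, \mathcal{F}^X_t\big], \\
\hat\Phi(t) &= (\hat\Phi_1(t), \ldots, \hat\Phi_m(t))' \in \mathbb{R}^m, \\
C(X(t)) &= (A_1X(t) + B_1U(t), \ldots, A_mX(t) + B_mU(t)) \in \mathbb{R}^{n \times m}, \\
D(\varphi) &= \mathrm{diag}(\varphi) - \varphi\varphi' \quad \text{for } \varphi \in \mathbb{R}^m, \\
\mathrm{diag}(\varphi) &= \mathrm{diag}(\varphi_1, \ldots, \varphi_m).
\end{aligned}
\tag{2.2}
\]

Denote also the innovation process by

\[ dV(t) = dX(t) - C(X(t))\hat\Phi(t)\,dt. \]

Using the above notation, we can rewrite the converted completely observable system as

\[
d\begin{pmatrix} X(t) \\ \hat\Phi(t) \end{pmatrix}
= \begin{pmatrix} C(X(t))\hat\Phi(t) \\ \Pi'\hat\Phi(t) \end{pmatrix} dt
+ \begin{pmatrix} I_n \\ D(\hat\Phi(t))C(X(t))' \end{pmatrix} dV(t),
\tag{2.3}
\]

where $I_n$ stands for the identity matrix of order $n$.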
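The filter dynamics in the converted system can be illustrated with a simple Euler discretisation. The sketch below is restricted to a scalar state ($n = 1$), uses a hypothetical linear feedback $U = -\text{gain}\cdot x$, and clips and renormalises the filter after each step as a numerical safeguard; these choices, the step size, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def wonham_euler_step(x, phi, u, A, B, Pi, dx, dt):
    """One Euler step of d(phi) = Pi' phi dt + D(phi) C(x)' dV for a scalar state."""
    m = len(phi)
    c = [A[i] * x + B[i] * u for i in range(m)]            # entries of the row C(x)
    c_bar = sum(c[i] * phi[i] for i in range(m))           # C(x) phi
    dv = dx - c_bar * dt                                   # innovation increment dV
    new_phi = []
    for i in range(m):
        drift = sum(Pi[j][i] * phi[j] for j in range(m))   # i-th entry of Pi' phi
        gain = phi[i] * (c[i] - c_bar)                     # i-th entry of D(phi) C(x)'
        new_phi.append(phi[i] + drift * dt + gain * dv)
    # Numerical safeguard (our assumption): clip and renormalise onto the simplex.
    new_phi = [min(max(p, 1e-12), 1.0) for p in new_phi]
    total = sum(new_phi)
    return [p / total for p in new_phi]


def simulate(A, B, Pi, gain, x0, phi0, alpha0, horizon, dt, seed=0):
    """Simulate the hidden chain, the observed state and the filter together."""
    rng = random.Random(seed)
    m = len(A)
    x, phi, alpha = x0, list(phi0), alpha0
    for _ in range(int(round(horizon / dt))):
        u = -gain * x                                      # hypothetical linear feedback
        if rng.random() < -Pi[alpha][alpha] * dt:          # the chain jumps on this step
            weights = [Pi[alpha][j] if j != alpha else 0.0 for j in range(m)]
            alpha = rng.choices(range(m), weights=weights)[0]
        dw = rng.gauss(0.0, math.sqrt(dt))
        dx = (A[alpha] * x + B[alpha] * u) * dt + dw       # observed increment of X
        phi = wonham_euler_step(x, phi, u, A, B, Pi, dx, dt)
        x += dx
    return x, phi
```

Note that the filter update only uses the observed increment `dx`, never the hidden state `alpha`, which is what makes the resulting control admissible in the sense defined above.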

Remark 2.1. Before proceeding further, we shall make a few remarks.

• The form $C(X(t))$ indicates the $X(t)$-dependence. When the feedback control $U(t)$ is
of linear form, it depends on $X(t)$ linearly. This point will be used in what follows.

• The equivalent and completely observable system can be viewed as a controlled
diffusion, in which the usual diffusion term is replaced by

\[ \begin{pmatrix} I_n \\ D(\hat\Phi(t))C(X(t))' \end{pmatrix} \]

and the driving Brownian motion is given by $V(t)$.

• When linear feedback control is used, both the drift and diffusion grow at most linearly,
which is a useful observation.



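Since $\hat\Phi(t)$ is the probability conditioned on the observation, for each $t \ge 0$ and each
$i \in \mathcal{M}$, $\hat\Phi_i(t) \ge 0$ with

\[ \sum_{i=1}^m \hat\Phi_i(t) = 1. \]

Denote the joint vector by $Y(t) = (X(t), \hat\Phi(t))' \in \mathbb{R}^{n+m}$. In what follows, we often
consider $|Y(t)| > r$ for some $r > 0$, where $|y|$ is the usual Euclidean norm. Denote by
$N(0; r) \subset \mathbb{R}^{n+m}$ the neighborhood centered at $0$ with radius $r$. Using the notation defined
in (2.2) associated with the stochastic differential equation (2.3), we define the following
operator. For each sufficiently smooth real-valued function $h : \mathbb{R}^{n+m} \setminus N(0; r) \to \mathbb{R}$, define

\[
\begin{aligned}
\mathcal{L}h(y) = \mathcal{L}h(x, \varphi)
&= (\nabla h(x, \varphi))' \begin{pmatrix} C(x)\varphi \\ \Pi'\varphi \end{pmatrix} \\
&\quad + \frac{1}{2}\,\mathrm{tr}\Big( \big(I_n \;\; C(x)D(\varphi)'\big)\, \nabla^2 h(x, \varphi)\, \big(I_n \;\; C(x)D(\varphi)'\big)' \Big),
\end{aligned}
\tag{2.4}
\]

where $\nabla h$ and $\nabla^2 h$ are the gradient and Hessian of $h$, respectively.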
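As a sanity check on the operator just defined, the sketch below evaluates it numerically for a scalar state ($n = 1$) using central finite differences for the gradient and Hessian. The finite-difference step, the function name `generator`, and the convention that the control value `u` is supplied as a number evaluated at the current point are assumptions made for this illustration.

```python
def generator(h, x, phi, u, A, B, Pi, eps=1e-4):
    """Approximate (L h)(x, phi) for a scalar state via central differences.

    h takes (x, phi) with phi a list; the drift and diffusion coefficients
    are evaluated at the point (x, phi) and only h is differentiated.
    """
    m = len(phi)
    y = [x] + list(phi)
    c = [A[i] * x + B[i] * u for i in range(m)]                  # row C(x)
    c_bar = sum(c[i] * phi[i] for i in range(m))                 # C(x) phi
    drift = [c_bar] + [sum(Pi[j][i] * phi[j] for j in range(m)) for i in range(m)]
    sigma = [1.0] + [phi[i] * (c[i] - c_bar) for i in range(m)]  # column (I_n; D(phi) C(x)')

    def shifted(k, dk, l=None, dl=0.0):
        # Evaluate h after perturbing coordinate k (and optionally l) of y.
        z = list(y)
        z[k] += dk
        if l is not None:
            z[l] += dl
        return h(z[0], z[1:])

    value = 0.0
    dim = m + 1
    for k in range(dim):
        grad_k = (shifted(k, eps) - shifted(k, -eps)) / (2.0 * eps)
        value += grad_k * drift[k]                               # first-order part
        for l in range(dim):
            hess_kl = (shifted(k, eps, l, eps) - shifted(k, eps, l, -eps)
                       - shifted(k, -eps, l, eps) + shifted(k, -eps, l, -eps)) / (4.0 * eps * eps)
            value += 0.5 * sigma[k] * hess_kl * sigma[l]         # second-order part
    return value
```

With $h(x, \varphi) = x$ the operator reduces to the drift $C(x)\varphi$, and with $h(x, \varphi) = x^2$ it gives $2x\,C(x)\varphi + 1$, which provides two quick consistency checks of the formula.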

3 Regularity 

First, let us recall the definition of regularity. According to [9], the Markov process

\[ Y(t) = \begin{pmatrix} X(t) \\ \hat\Phi(t) \end{pmatrix} \]

is regular if, for any $0 < T < \infty$,

\[ P\Big( \sup_{0 \le t \le T} |Y(t)| = \infty \Big) = 0. \]

Roughly speaking, regularity ensures the process under consideration will not have finite
explosion time. For our adaptive control systems, we proceed to show that under linear
feedback control, the system is regular.

Theorem 3.1. Assume that the feedback control $U(t) = U(X(t), \hat\Phi(t))$ is admissible and
that it grows at most linearly in $X(t)$. Then, the feedback control system (2.3) is regular.

Remark 3.2. In fact, for our problem, we are mainly interested in linear (in the $x$ variable)
feedback controls. In this case, the linear growth condition is clearly satisfied.






Proof. Let $G$ be an open set in $\mathbb{R}^{n+m}$ and denote

\[ O = \Big\{ y = (x, \varphi)' \in G,\ \varphi = (\varphi_1, \ldots, \varphi_m) \text{ satisfying } \varphi_i \ge 0 \text{ for } i \in \mathcal{M}, \text{ and } \sum_{i=1}^m \varphi_i = 1 \Big\}. \]

We first observe that both the drift and the diffusion coefficient given in (2.3) satisfy the
linear growth and Lipschitz conditions in every open set in $O \subset \mathbb{R}^{n+m}$. Thus, to prove the
regularity, using the result in [9], we only need to show that there is a nonnegative function
$\mathcal{U}$ which is twice continuously differentiable in $O_r = \{y \in O, |y| > r\}$ for some $r > 0$ with
$y = (x, \varphi)'$ such that

\[ \inf_{|y| > R} \mathcal{U}(y) \to \infty \quad \text{as } R \to \infty, \tag{3.1} \]

and that there is a $\gamma > 0$ satisfying

\[ \mathcal{L}\mathcal{U}(y) \le \gamma\, \mathcal{U}(y). \tag{3.2} \]

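Thus, to verify the regularity of the process $Y(t)$, all that is needed is to construct an appropriate
Lyapunov function $\mathcal{U}$. Note that we only need a Lyapunov function that is smooth and
defined in the complement of a sphere. Equivalently, we only need the smoothness of the
Lyapunov function to be in a deleted neighborhood of the origin. To this end, take $r = 1$
and denote by $O_1$ the set

\[ O_1 = \Big\{ y = (x, \varphi)' \in \mathbb{R}^{n+m},\ |y| > 1 \text{ and } \varphi = (\varphi_1, \ldots, \varphi_m) \text{ satisfying } \varphi_i \ge 0 \text{ for } i \in \mathcal{M}, \text{ and } \sum_{i=1}^m \varphi_i = 1 \Big\}. \tag{3.3} \]

Define $\mathcal{U} : O_1 \to \mathbb{R}$ as $\mathcal{U}(y) = |y|$. It is easily checked that condition (3.1) holds. Moreover,
we have

\[ \nabla \mathcal{U}(y) = \frac{1}{|y|} \begin{pmatrix} x \\ \varphi \end{pmatrix} \]

and

\[ \nabla^2 \mathcal{U}(y) = \frac{1}{|y|} I_{n+m} - \frac{1}{|y|^3} \begin{pmatrix} xx' & x\varphi' \\ \varphi x' & \varphi\varphi' \end{pmatrix}. \]

Consequently, it follows from (2.4) that

\[
\begin{aligned}
\mathcal{L}\mathcal{U}(x, \varphi)
&= \frac{1}{|y|}\big( x'C(x)\varphi + \varphi'\Pi'\varphi \big)
 + \frac{1}{2|y|}\big( n + \mathrm{tr}(D(\varphi)C'(x)C(x)D'(\varphi)) \big) \\
&\quad - \frac{1}{2|y|^3}\,\mathrm{tr}\Big( \big(I_n \;\; C(x)D(\varphi)'\big)
 \begin{pmatrix} xx' & x\varphi' \\ \varphi x' & \varphi\varphi' \end{pmatrix}
 \big(I_n \;\; C(x)D(\varphi)'\big)' \Big).
\end{aligned}
\tag{3.4}
\]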

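Note that the set that we are working with is $O_1$ defined in (3.3). In particular, the use
of $O_1$ yields that for any $y \in O_1$, $|\varphi|$ is always bounded. We also note that owing to the
definition of $C(x)$ and the linear growth of the feedback controls used, $C(x)$ grows at
most linearly in $x$. To proceed, henceforth, use $\gamma$ as a generic positive constant with the
convention that $\gamma + \gamma = \gamma$ and $\gamma\gamma = \gamma$ in an appropriate sense. It follows that for the
first group of terms in (3.4), for $|(x, \varphi)'|$ large enough,

\[ \frac{1}{|y|}\big( x'C(x)\varphi + \varphi'\Pi'\varphi \big) \le \frac{\gamma}{|y|}\big(1 + |x|^2\big) \le \gamma |y|. \]

Likewise, for the next two terms, since the last trace in (3.4) is nonnegative, we have

\[ \frac{1}{2|y|}\big( n + \mathrm{tr}(D(\varphi)C'(x)C(x)D'(\varphi)) \big)
 - \frac{1}{2|y|^3}\,\mathrm{tr}\Big( \big(I_n \;\; C(x)D(\varphi)'\big)
 \begin{pmatrix} xx' & x\varphi' \\ \varphi x' & \varphi\varphi' \end{pmatrix}
 \big(I_n \;\; C(x)D(\varphi)'\big)' \Big) \le \gamma |y|. \]

Combining the above estimates, we can deduce that

\[ \mathcal{L}\mathcal{U}(x, \varphi) \le \gamma |y| = \gamma\, \mathcal{U}(x, \varphi) \]

for some $\gamma > 0$. Consequently, the second condition (3.2) is satisfied. Thus the regularity of
the feedback control system is obtained. □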
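The linear-growth estimate at the heart of this proof can be checked numerically in a simple case. The sketch below evaluates the generator of the norm function in closed form for a scalar state ($n = 1$) under a hypothetical linear feedback $U = -k x$, and compares it with an explicit constant. The constant in `explicit_gamma` is our own crude bound derived for this scalar setting, not a constant from the paper.

```python
import math
import random


def generator_of_norm(x, phi, u, A, B, Pi):
    """Closed-form generator of the function y -> |y| for a scalar state (n = 1).

    Returns the pair (value of the generator, |y|) at y = (x, phi).
    """
    m = len(phi)
    c = [A[i] * x + B[i] * u for i in range(m)]                          # row C(x)
    c_bar = sum(ci * pi for ci, pi in zip(c, phi))                       # C(x) phi
    pi_phi = [sum(Pi[j][i] * phi[j] for j in range(m)) for i in range(m)]  # Pi' phi
    dc = [phi[i] * (c[i] - c_bar) for i in range(m)]                     # D(phi) C(x)'
    norm_y = math.sqrt(x * x + sum(p * p for p in phi))
    first = (x * c_bar + sum(p * q for p, q in zip(phi, pi_phi))) / norm_y
    second = (1.0 + sum(d * d for d in dc)) / (2.0 * norm_y)
    third = (x + sum(p * d for p, d in zip(phi, dc))) ** 2 / (2.0 * norm_y ** 3)
    return first + second - third, norm_y


def explicit_gamma(A, B, Pi, k):
    """A crude constant with L|y| <= gamma |y| for |y| > 1 under U = -k x."""
    a_max = max(abs(a - b * k) for a, b in zip(A, B))     # closed-loop growth rate
    frob = math.sqrt(sum(q * q for row in Pi for q in row))
    return a_max + 2.0 * a_max ** 2 + frob + 0.5
```

Sampling points with $|y| > 1$ and $\varphi$ on the simplex, one can verify that the ratio of the generator to $|y|$ stays below `explicit_gamma(A, B, Pi, k)`, which is the scalar counterpart of the second condition in the regularity criterion.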

4 Stabilization 

In this section, we establish conditions under which the system of interest is stabilizable in 
the almost sure sense. We first present the definition and then proceed to find sufficient 
conditions for stabilization. 
Definition 4.1. System (2.1), or equivalently (2.3), is said to be almost surely stabilizable
if there is a feedback control law $U(t)$ such that the resulting trajectories satisfy

\[ \limsup_{t \to \infty} \frac{1}{t} \log |X(t)| \le 0 \quad \text{almost surely.} \tag{4.1} \]

Note that the definition given in (4.1) is natural. When studying stability of stochastic
differential equations, especially for pathwise stability, one uses the so-called $q$th-moment
Lyapunov exponent

\[ \limsup_{t \to \infty} \frac{1}{t} \log |X(t)|^q \]

for some $q > 0$. Here, roughly, we require that under the control law, the first-moment
Lyapunov exponent is non-positive.
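To see what the criterion measures along a single trajectory, the following small sketch computes the running quantity $\frac{1}{t}\log|X(t)|$ from sampled values of $|X(t)|$. The function names are hypothetical and the sampled paths in the usage note are synthetic, chosen only to illustrate the definition.

```python
import math


def sample_lyapunov_exponent(t, x_norm):
    """Return (1/t) log|X(t)| for one observed value of |X(t)| at time t > 0."""
    if t <= 0.0 or x_norm <= 0.0:
        raise ValueError("need t > 0 and |X(t)| > 0")
    return math.log(x_norm) / t


def running_exponents(times, norms):
    """Running values of (1/t) log|X(t)| along one sampled path."""
    return [sample_lyapunov_exponent(t, r)
            for t, r in zip(times, norms) if t > 0.0 and r > 0.0]
```

For a path decaying like $e^{-2t}$ the running values equal $-2$, while for a path growing like $t$ they are positive but tend to zero. Both paths meet the requirement that the limit superior be non-positive, which shows that the criterion rules out exponential growth rather than every form of growth.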

4.1 Auxiliary Results 

Before proceeding further, let us first recall a lemma, which is concerned with the existence
of the associated system of Riccati equations when quadratic cost criteria are used. The
proof of the lemma is given in [7].

Lemma 4.2. Consider the system of Riccati equations

\[ A_i'P_i + P_iA_i - P_iB_iR^{-1}B_i'P_i + \sum_{j=1}^m \pi_{ij}P_j + Q = 0, \quad i \in \mathcal{M}, \tag{4.2} \]

where $Q \in \mathbb{R}^{n \times n}$ is symmetric and positive semi-definite, and $R \in \mathbb{R}^{d \times d}$ is symmetric and
positive definite. The system (4.2) has a solution if and only if for each $i \in \mathcal{M}$, there is a
matrix $P_i$ satisfying

\[ A_i'P_i + P_iA_i - P_iB_iR^{-1}B_i'P_i + \sum_{j=1}^m \pi_{ij}P_j + Q \le 0. \tag{4.3} \]

Furthermore, if $Q$ is positive definite, so are $P_i$ for $i \in \mathcal{M}$.
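For intuition about the coupled Riccati system, the sketch below solves it in the scalar case ($n = d = 1$) by a Gauss-Seidel style fixed-point iteration: with the other unknowns frozen, each equation is a quadratic in $p_i$ and its positive root is taken. This iteration is a swapped-in numerical device for illustration, not the method of [7]; the starting point, iteration count, and function names are assumptions, and each $b_i$ is assumed nonzero.

```python
import math


def solve_coupled_riccati_scalar(a, b, Pi, q, r, iterations=200):
    """Fixed-point iteration for 2 a_i p_i - (b_i^2 / r) p_i^2 + sum_j Pi[i][j] p_j + q = 0."""
    m = len(a)
    p = [0.0] * m
    for _ in range(iterations):
        for i in range(m):
            coupling = sum(Pi[i][j] * p[j] for j in range(m) if j != i)
            quad = b[i] * b[i] / r             # coefficient of p_i^2 (assumed > 0)
            lin = 2.0 * a[i] + Pi[i][i]        # coefficient of p_i
            const = q + coupling               # terms not involving p_i
            # Positive root of quad * p^2 - lin * p - const = 0.
            p[i] = (lin + math.sqrt(lin * lin + 4.0 * quad * const)) / (2.0 * quad)
    return p


def riccati_residuals(p, a, b, Pi, q, r):
    """Left-hand sides of the scalar coupled Riccati equations at p."""
    m = len(a)
    return [2.0 * a[i] * p[i] - b[i] * b[i] * p[i] * p[i] / r
            + sum(Pi[i][j] * p[j] for j in range(m)) + q
            for i in range(m)]
```

With `a = [0.5, -1.0]`, `b = [1.0, 1.0]`, `Pi = [[-1.0, 1.0], [2.0, -2.0]]` and `q = r = 1.0`, the iteration settles on a positive pair `p` whose residuals are at rounding level, in line with the positivity statement of the lemma for positive definite $Q$.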

To carry out the analysis, we need some auxiliary results on the bounds of the quadratic 
variation process. Before getting the almost sure bounds, we examine the moment bounds 
for certain related martingales, which turn out to be interesting in their own right. The main 
ingredient is the use of properties of the associated Markov chain. 
Moment Bounds

Proposition 4.3. Consider the stochastic differential equation

\[ d\hat\Phi(t) = \Pi'\hat\Phi(t)\,dt + D(\hat\Phi(t))C(X(t))'\,dV(t) \tag{4.4} \]

and define the associated martingale

\[ N(t) = \int_0^t D(\hat\Phi(s))C(X(s))'\,dV(s). \tag{4.5} \]

Suppose that the Markov chain $\alpha(t)$ is irreducible. Then, for some positive constant $K$
independent of $t$,

\[ E\Big[\Big|\frac{1}{\sqrt{t}} N(t)\Big|^2\Big] \le K. \tag{4.6} \]

Proof. The proof is given in Appendix A. □

Remark 4.4. It follows from the proof of Proposition 4.3 that the limit of the matrix

\[ S = \lim_{t \to \infty} \frac{1}{t} \int_0^t\!\!\int_0^t \Pi'[\hat\Phi(u) - \nu][\hat\Phi(s) - \nu]'\Pi\,du\,ds \]

is finite. Clearly, this matrix is symmetric and positive semi-definite. A moment of reflection
reveals that we can further prove the asymptotic normality. That is,

\[ \frac{1}{\sqrt{t}} \int_0^t \Pi'[\hat\Phi(s) - \nu]\,ds \ \text{ converges in distribution to } \mathcal{N}(0, S) \text{ as } t \to \infty. \]

That is, a normalized sequence defined on the left-hand side above converges in distribution
to a normal random vector with mean $0$ and covariance $S$.

Another ramification is that in lieu of considering the second-moment bounds, we can
deal with $q$th-moment bounds. In fact, using the same techniques, we can show that for any
integer $p > 0$,

\[ E\Big[\Big|\frac{1}{\sqrt{t}} \int_0^t \Pi'[\hat\Phi(s) - \nu]\,ds\Big|^{2p}\Big] < \infty. \]

Hence, as the solution of (4.4) is given by

\[ \hat\Phi(t) = \hat\Phi(0) + \int_0^t \Pi'\hat\Phi(s)\,ds + N(t), \]

which means that

\[ N(t) = \hat\Phi(t) - \hat\Phi(0) - \int_0^t \Pi'\hat\Phi(s)\,ds, \]

we obtain that

\[ E\Big[\Big|\frac{1}{\sqrt{t}} N(t)\Big|^{2p}\Big] < \infty. \]

Next, for odd exponents and for any integer $p \ge 1$, it follows from Hölder's inequality that

\[ E\Big[\Big|\frac{1}{\sqrt{t}} N(t)\Big|^{2p-1}\Big] \le \Big( E\Big[\Big|\frac{1}{\sqrt{t}} N(t)\Big|^{2p}\Big] \Big)^{\frac{2p-1}{2p}} < \infty. \]

Finally, we conclude that for any positive integer $q$,

\[ E\Big[\Big|\frac{1}{\sqrt{t}} N(t)\Big|^{q}\Big] < \infty. \]

Almost Sure Bounds

For the almost sure stabilization, we need to show that

\[ \frac{1}{t} |N(t)|^2 \le K \quad \text{a.s.} \]

for some $K > 0$ independent of $t$.

Proposition 4.5. Consider (4.4) and suppose that the Markov chain $\alpha(t)$ is irreducible.
Then, the quadratic variation of the process $N(t)$ satisfies $\langle N, N \rangle_t \le Kt$ where $K$ is some
positive constant independent of $t$. Therefore,

\[ \lim_{t \to \infty} \frac{1}{t} N(t) = 0 \quad \text{a.s.} \]

Proof. The proof is given in Appendix B. □

4.2 Stabilization 

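Lemma 4.6. Consider the set $A$ defined by

\[ A = \Big\{ (x, \varphi) \in \mathbb{R}^n \times \mathbb{R}^m,\ \varphi = (\varphi_1, \ldots, \varphi_m) \text{ satisfying } \varphi_i \ge 0 \text{ and } \sum_{i=1}^m \varphi_i = 1 \Big\}. \tag{4.7} \]

Denote

\[ P(\varphi) = \sum_{i=1}^m P_i \varphi_i. \tag{4.8} \]

For some $\theta > 0$, let $V_\theta(x, \varphi) : A \to \mathbb{R}$ with $V_\theta(x, \varphi) = \log(\theta + x'P(\varphi)x)$. Then, we have

\[
\begin{aligned}
\mathcal{L}V_\theta(x, \varphi)
&= \frac{1}{\theta + x'P(\varphi)x}\big( 2x'P(\varphi)C(x)\varphi + (x'Px)'\Pi'\varphi \big) \\
&\quad + \frac{1}{\theta + x'P(\varphi)x}\,\mathrm{tr}\big( P(\varphi) + 2C(x)D(\varphi)'x'P \big) \\
&\quad - \frac{1}{2(\theta + x'P(\varphi)x)^2}\,\mathrm{tr}\Big( \big(I_n \;\; C(x)D(\varphi)'\big)\, \Lambda(x, \varphi)\, \big(I_n \;\; C(x)D(\varphi)'\big)' \Big),
\end{aligned}
\tag{4.9}
\]

where

\[
\Lambda(x, \varphi) = \begin{pmatrix} 2P(\varphi)x \\ x'Px \end{pmatrix} \begin{pmatrix} 2P(\varphi)x \\ x'Px \end{pmatrix}',
\]

\[
P = (P_1, \ldots, P_m)', \quad x'Px = (x'P_1x, \ldots, x'P_mx)' \in \mathbb{R}^m, \quad
Px = (P_1x, \ldots, P_mx) \in \mathbb{R}^{n \times m}, \quad x'P = (Px)' \in \mathbb{R}^{m \times n}.
\]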

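Proof. We have

\[ \nabla \log(\theta + x'P(\varphi)x) = \frac{1}{\theta + x'P(\varphi)x} \begin{pmatrix} 2P(\varphi)x \\ x'Px \end{pmatrix} \]

and

\[
\nabla^2 \log(\theta + x'P(\varphi)x)
= -\frac{1}{(\theta + x'P(\varphi)x)^2} \begin{pmatrix} 2P(\varphi)x \\ x'Px \end{pmatrix} \begin{pmatrix} 2P(\varphi)x \\ x'Px \end{pmatrix}'
+ \frac{2}{\theta + x'P(\varphi)x} \begin{pmatrix} P(\varphi) & Px \\ x'P & 0_m \end{pmatrix},
\]

where $0_m$ stands for a square matrix of order $m$ with all entries equal to zero. Consequently,
it follows from (2.4) that

\[
\begin{aligned}
\mathcal{L}V_\theta(x, \varphi)
&= \frac{1}{\theta + x'P(\varphi)x}\big( 2x'P(\varphi)C(x)\varphi + (x'Px)'\Pi'\varphi \big) \\
&\quad + \frac{1}{2}\,\mathrm{tr}\Big( \big(I_n \;\; C(x)D(\varphi)'\big)\, \nabla^2 V_\theta(x, \varphi)\, \big(I_n \;\; C(x)D(\varphi)'\big)' \Big),
\end{aligned}
\tag{4.10}
\]

which immediately implies (4.9). □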

For the purpose of stabilization, we also need an estimate on $\mathcal{L}V_\theta(X(t), \hat\Phi(t))$.

Lemma 4.7. Assume that equation (4.3) is satisfied and that

\[ Q - \frac{1}{2}\big[P_iB_i - P_jB_j\big] R^{-1} \big[P_iB_i - P_jB_j\big]' \]

are positive definite matrices for all $(i, j) \in \mathcal{M}^2$, where $P_i$ for $i \in \mathcal{M}$ are the solutions of the
algebraic Riccati equations given by (4.2). Then, the infinitesimal generator of the process
$(X(t), \hat\Phi(t))$ associated with the feedback control law

\[ U(t) = -R^{-1} \sum_{i=1}^m \hat\Phi_i(t) B_i' P_i X(t) \tag{4.11} \]

satisfies, for some constant $\gamma > 0$,

\[ \mathcal{L}V_\theta(X(t), \hat\Phi(t)) \le \frac{\gamma}{\theta}. \tag{4.12} \]
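The feedback law and the pairwise positive definiteness requirement of this lemma are easy to evaluate in the scalar case ($n = d = 1$). The sketch below does so; the factor $\tfrac{1}{2}$ follows our reading of the condition, the numerical values in the usage note are hypothetical rather than Riccati solutions from the paper, and the function names are illustrative.

```python
def feedback_control(x, phi_hat, P, B, r):
    """Scalar version of the adaptive law: U = -(1/r) * sum_i phi_hat_i * B_i * P_i * x."""
    return -sum(f * b * p for f, b, p in zip(phi_hat, B, P)) * x / r


def gap_condition_holds(P, B, q, r):
    """Check q - (1/2) (P_i B_i - P_j B_j)^2 / r > 0 for all pairs (i, j)."""
    m = len(P)
    return all(q - 0.5 * (P[i] * B[i] - P[j] * B[j]) ** 2 / r > 0.0
               for i in range(m) for j in range(m))
```

For example, with `P = [1.3, 0.8]`, `B = [1.0, 1.0]` and `q = r = 1.0` the condition holds, and the control at `x = 2.0` with filter value `[0.25, 0.75]` is approximately `-1.85`. The law weights each mode's gain by the current filter probability, so it depends only on observable quantities.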

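Proof. We can deduce from Lemma 4.6 that

\[
\begin{aligned}
\mathcal{L}V_\theta(X(t), \hat\Phi(t))
&\le \frac{1}{\theta + X(t)'P(\hat\Phi(t))X(t)}\big( 2X(t)'P(\hat\Phi(t))C(X(t))\hat\Phi(t) \big) \\
&\quad + \frac{1}{\theta + X(t)'P(\hat\Phi(t))X(t)}\big( (X(t)'PX(t))'\Pi'\hat\Phi(t) \big) \\
&\quad + \frac{1}{\theta + X(t)'P(\hat\Phi(t))X(t)}\,\mathrm{tr}\big( P(\hat\Phi(t)) + 2C(X(t))D(\hat\Phi(t))'X(t)'P \big).
\end{aligned}
\]

Therefore, following exactly the same lines as in [5], we obtain that

\[
\begin{aligned}
\mathcal{L}V_\theta(X(t), \hat\Phi(t)) \le -\frac{1}{\theta + X(t)'P(\hat\Phi(t))X(t)} \Big( & X(t)'\Big[ Q
 - \sum_{i=1}^m \sum_{j=1}^m \frac{\hat\Phi_i(t)\hat\Phi_j(t)}{2} \big[P_iB_i - P_jB_j\big] R^{-1} \big[P_iB_i - P_jB_j\big]' \\
& + \Big( \sum_{j=1}^m \hat\Phi_j(t) B_j'P_j \Big)' R^{-1} \Big( \sum_{i=1}^m \hat\Phi_i(t) B_i'P_i \Big) \Big] X(t)
 - \mathrm{tr}\big(P(\hat\Phi(t))\big) \Big).
\end{aligned}
\]

Finally,

\[ \mathcal{L}V_\theta(X(t), \hat\Phi(t)) \le \frac{1}{\theta} \sum_{i=1}^m \mathrm{tr}(P_i), \]

which completes the proof of Lemma 4.7. □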
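The proof above delegates the central computation to [5]. The following is a sketch of how the terms can be grouped, written under the assumptions that $\sum_i \varphi_i = 1$, that the $P_i$ are symmetric solutions of (4.2), and that the control is the law (4.11). The shorthand $G_i = P_iB_i$, $\bar G = \sum_i \varphi_i G_i$ and $c_i = A_ix + B_iU$ is ours, not the paper's.

```latex
% Step 1: the trace term cancels the cross terms between different modes.
\begin{align*}
\operatorname{tr}\big(C(x)D(\varphi)'x'P\big)
  &= \sum_{i=1}^m \varphi_i\, x'P_i c_i - x'P(\varphi)C(x)\varphi, \\
2x'P(\varphi)C(x)\varphi + 2\operatorname{tr}\big(C(x)D(\varphi)'x'P\big)
  &= 2\sum_{i=1}^m \varphi_i\, x'P_i (A_i x + B_i U), \\
(x'Px)'\Pi'\varphi
  &= \sum_{i=1}^m \varphi_i\, x'\Big(\sum_{j=1}^m \pi_{ij}P_j\Big)x .
\end{align*}
% Step 2: insert U = -R^{-1}\bar G' x and the Riccati equations (4.2),
% then use the variance-type identity
%   \sum_i \varphi_i G_i R^{-1} G_i' - \bar G R^{-1} \bar G'
%     = (1/2) \sum_{i,j} \varphi_i \varphi_j (G_i - G_j) R^{-1} (G_i - G_j)'.
\begin{align*}
&\sum_{i=1}^m \varphi_i\, x'\Big(A_i'P_i + P_iA_i + \sum_{j=1}^m \pi_{ij}P_j\Big)x
   + 2\sum_{i=1}^m \varphi_i\, x'G_i U \\
&\qquad = -x'Qx + \sum_{i=1}^m \varphi_i\, x'G_iR^{-1}G_i'x - 2x'\bar G R^{-1}\bar G' x \\
&\qquad = -x'\Big[Q - \frac12\sum_{i=1}^m\sum_{j=1}^m \varphi_i\varphi_j\,(G_i-G_j)R^{-1}(G_i-G_j)'
   + \bar G R^{-1}\bar G'\Big]x .
\end{align*}
```

Under the positive definiteness assumption of the lemma, the bracketed matrix is a convex combination of positive definite matrices plus a positive semi-definite term, so the quadratic form can be dropped and only the trace term, which is at most $\sum_i \mathrm{tr}(P_i)$, remains in the numerator.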



Theorem 4.8. Assume that the conditions of Lemma 4.7 are satisfied. Then, the feedback
control law defined in equation (4.11) stabilizes the system (2.3) almost surely.



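Proof. It follows from Itô's rule that

\[ V_\theta(X(t), \hat\Phi(t)) = V_\theta(x, \varphi) + \int_0^t \mathcal{L}V_\theta(X(s), \hat\Phi(s))\,ds + M(t) \tag{4.13} \]

with the initial condition $X(0) = x$ and $\hat\Phi(0) = \varphi$ and the martingale term

\[ M(t) = \int_0^t J(s)\,dV(s), \]

where

\[
\begin{aligned}
J(s) &= \frac{1}{\theta + X(s)'P(\hat\Phi(s))X(s)} \Big( 2X(s)'P(\hat\Phi(s)) \;\;\; (X(s)'PX(s))' \Big) \begin{pmatrix} I_n \\ D(\hat\Phi(s))C(X(s))' \end{pmatrix} \\
&= \frac{1}{\theta + X(s)'P(\hat\Phi(s))X(s)} \Big( 2X(s)'P(\hat\Phi(s)) + (X(s)'PX(s))'D(\hat\Phi(s))C(X(s))' \Big).
\end{aligned}
\]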
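We can split the martingale $M(t)$ into two terms, $M(t) = N_1(t) + N_2(t)$, with

\[
N_1(t) = \int_0^t \frac{2X(s)'P(\hat\Phi(s))}{\theta + X(s)'P(\hat\Phi(s))X(s)}\,dV(s), \qquad
N_2(t) = \int_0^t \frac{(X(s)'PX(s))'D(\hat\Phi(s))C(X(s))'}{\theta + X(s)'P(\hat\Phi(s))X(s)}\,dV(s).
\]

It is easy to see that

\[ \frac{4X(t)'P(\hat\Phi(t))P(\hat\Phi(t))X(t)}{(\theta + X(t)'P(\hat\Phi(t))X(t))^2} \le K_1 \quad \text{where} \quad K_1 = \frac{1}{\theta} \max_{i \in \mathcal{M}} (\lambda_{\max}(P_i)). \]

Then, the quadratic variation of $N_1(t)$ satisfies $\langle N_1, N_1 \rangle_t \le K_1 t$ a.s. Consequently, we deduce
from the strong law of large numbers for local martingales [12] that

\[ \lim_{t \to \infty} \frac{1}{t} N_1(t) = 0 \quad \text{a.s.} \tag{4.14} \]

In view of Proposition 4.5, one can also find a positive constant $K_2$, independent of $t$, such
that

\[
\langle N_2, N_2 \rangle_t \le \int_0^t \frac{|X(s)'PX(s)|^2}{(\theta + X(s)'P(\hat\Phi(s))X(s))^2}\, \big|D(\hat\Phi(s))C(X(s))'\big|^2\,ds \le K_2 t \quad \text{a.s.} \tag{4.15}
\]

It also ensures that

\[ \lim_{t \to \infty} \frac{1}{t} N_2(t) = 0 \quad \text{a.s.} \tag{4.16} \]

Therefore, (4.14) and (4.16) imply that

\[ \lim_{t \to \infty} \frac{1}{t} M(t) = 0 \quad \text{a.s.} \tag{4.17} \]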
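The constant $K_1$ used in this step can be checked by a short computation. The sketch below assumes that each $P_i$ is symmetric positive definite and that $\varphi$ lies on the probability simplex, and writes $g = x'P(\varphi)x \ge 0$; this shorthand is ours.

```latex
\begin{align*}
x'P(\varphi)^2 x &\le \lambda_{\max}(P(\varphi))\, x'P(\varphi)x = \lambda_{\max}(P(\varphi))\, g,
  && \text{since } P(\varphi) \text{ is symmetric positive definite}, \\
(\theta + g)^2 &\ge 4\theta g,
  && \text{since } (\theta - g)^2 \ge 0, \\
\frac{4\,x'P(\varphi)^2 x}{(\theta + g)^2}
  &\le \frac{4\,\lambda_{\max}(P(\varphi))\, g}{4\theta g}
   = \frac{\lambda_{\max}(P(\varphi))}{\theta}
  && \text{for } g > 0 \text{ (the case } g = 0 \text{ is trivial)}, \\
\lambda_{\max}(P(\varphi)) &\le \sum_{i=1}^m \varphi_i\, \lambda_{\max}(P_i)
   \le \max_{i \in \mathcal{M}} \lambda_{\max}(P_i).
\end{align*}
```

Integrating the resulting bound over $[0, t]$ gives a quadratic variation of the first martingale term that grows at most linearly in $t$, which is what the strong law of large numbers for local martingales requires.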

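Thus, we find from (4.13) that

\[ \frac{1}{t} V_\theta(X(t), \hat\Phi(t)) = \frac{1}{t} V_\theta(x, \varphi) + \frac{1}{t} \int_0^t \mathcal{L}V_\theta(X(s), \hat\Phi(s))\,ds + o(1) \quad \text{a.s.} \]

Moreover, $V_\theta(x, \varphi)/t = o(1)$ as $t \to \infty$ a.s. By virtue of Lemma 4.7, it follows that for all
$\theta > 0$,

\[ \limsup_{t \to \infty} \frac{1}{t} V_\theta(X(t), \hat\Phi(t)) = \limsup_{t \to \infty} \frac{1}{t} \int_0^t \mathcal{L}V_\theta(X(s), \hat\Phi(s))\,ds \le \frac{\gamma}{\theta} \quad \text{a.s.} \tag{4.18} \]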
Furthermore, one can observe that $x'P(\varphi)x \ge \lambda_{\min}(P(\varphi))|x|^2$ and since $P(\varphi)$ is positive
definite, the minimal eigenvalue of $P(\varphi)$ is positive. Consequently,

\[ \log(\lambda_{\min}(P(\varphi))) + 2\log(|x|) \le \log(\theta + \lambda_{\min}(P(\varphi))|x|^2) \le \log(\theta + x'P(\varphi)x), \]

which leads to

\[ \frac{1}{t}\Big( \log(\lambda_{\min}(P(\hat\Phi(t)))) + 2\log(|X(t)|) \Big) \le \frac{1}{t} V_\theta(X(t), \hat\Phi(t)). \tag{4.19} \]

Finally, we conclude from (4.18) and (4.19) that for all $\theta > 0$,

\[ \limsup_{t \to \infty} \frac{1}{t} \log |X(t)| \le \frac{\gamma}{2\theta} \quad \text{a.s.} \]

We complete the proof of Theorem 4.8 by taking the limit as $\theta$ tends to infinity. □
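One small step is left implicit in the last deduction, namely that the eigenvalue term disappears after division by $t$. A sketch of the argument, assuming each $P_i$ is positive definite (as guaranteed by Lemma 4.2 when $Q$ is positive definite) and that the filter takes values in the probability simplex:

```latex
\begin{align*}
0 < \min_{i \in \mathcal{M}} \lambda_{\min}(P_i)
  \le \sum_{i=1}^m \hat\Phi_i(t)\, \lambda_{\min}(P_i)
  &\le \lambda_{\min}\big(P(\hat\Phi(t))\big)
  \le \max_{i \in \mathcal{M}} \lambda_{\max}(P_i), \\
\text{so that}\quad
\lim_{t \to \infty} \frac{1}{t} \log \lambda_{\min}\big(P(\hat\Phi(t))\big) &= 0 \quad \text{a.s.}, \\
\text{and hence}\quad
\limsup_{t \to \infty} \frac{2}{t} \log |X(t)|
  &\le \limsup_{t \to \infty} \frac{1}{t} V_\theta(X(t), \hat\Phi(t)) \le \frac{\gamma}{\theta} \quad \text{a.s.}
\end{align*}
```

Dividing by two gives the stated bound, and since the left-hand side does not depend on $\theta$, letting $\theta$ grow yields the non-positive exponent required by the definition of almost sure stabilizability.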

Remark 4.9. Normally, when dealing with stochastic differential equations, to obtain almost
sure bounds of the solutions, one often relies on the use of appropriate Lyapunov functions
to have the diffusion term of the process be bounded after a transformation. Here, we are
dealing with a martingale term with a somewhat faster rate of growth in $x$. Nevertheless,
thanks to the second component of the diffusion (4.4), the probabilistic meaning of $\hat\Phi(t)$
enables us to work around the obstacle. To obtain the desired bounds, an alternative is to
obtain an almost sure central limit theorem. Here, however, we take a different approach.
The main point is the use of Proposition 4.5.

Recall the notion of recurrence for the diffusion process $(X(t), \hat\Phi(t))$ starting at $X(0) = x$
and $\hat\Phi(0) = \varphi$. Consider an open set $O$ with compact closure, and let

\[ \sigma_O^{x,\varphi} = \inf\{ t \ge 0,\ (X(t), \hat\Phi(t)) \in O \} \]

be the first entrance time of the diffusion to the set $O$. If $(X(t), \hat\Phi(t))$ is regular, it is recurrent
with respect to $O$ if $P\{\sigma_O^{x,\varphi} < \infty\} = 1$ for any $(x, \varphi) \in O^c$, where $O^c$ is the complement
of $O$. A recurrent process with finite mean recurrence time for some set $O$ is said to be
positive recurrent with respect to $O$; otherwise, the process is null recurrent with respect to
$O$. It has been proven in [9] that recurrence and positive recurrence are independent of the
set $O$ chosen. Thus, if the process is recurrent (resp. positive recurrent) with respect to $O$, then it is
recurrent (resp. positive recurrent) with respect to any other open set in the domain of
interest. Looking over the proof of the stabilization presented, we could show that for the
Lyapunov function

\[ V_0(x, \varphi) = \log(x'P(\varphi)x), \]

one can find $\gamma > 0$ such that for all $(x, \varphi) \in O^c$,

\[ \mathcal{L}V_0(x, \varphi) \le -\gamma. \tag{4.20} \]

In view of the known result of positive recurrence of diffusion processes [9], (4.20) is precisely
a necessary and sufficient condition for positive recurrence. Thus, we obtain the following
result as a by-product.

Corollary 4.10. Under the conditions of Theorem 4.8 with the control law (4.11) used, the
diffusion system (2.3) is positive recurrent.

We would like to add that the positive recurrence of the process is an important property.
It has engineering implications for various applications. Essentially, it ensures that starting
from a point outside of a bounded set, the control law enables the system to return to a
compact set almost surely. This may be viewed as a practical stability condition. In fact,
Wonham used the term weak stability for such a property in his paper [18].

5 Further Remarks 

This paper has been concerned with stabilization in the almost sure sense of an adaptive 
control system with linear dynamics modulated by an unknown Markov chain. Under the 
framework of Wonham filtering, the underlying system is converted to a fully observable 
system. Using feedback control that is linear in the continuous state variable, we establish 
pathwise stabilization of the process. Along the way of our study, we have also obtained 
regularity of the underlying process. In addition, as a corollary, we have shown that under the 
stabilizing control law, the resulting system is positive recurrent. These results pave the way
for practical consideration of stabilization of adaptive controls of LQ systems with a hidden 
Markov chain. Several directions may be worthwhile for further study and investigation. 

• In our study, irreducibility of the Markov chain is used. We note that the irreducibility
ensures that the spectral gap condition, or exponential decay, in (A.6) and a related estimate
in the proof of Proposition 4.3 holds. It will be interesting to see if it is possible to remove this
condition. Our initial thoughts are that, under certain conditions, this might be possible.
For example, suppose the Markov chain has several irreducible classes such that the states in
each class vary rapidly, while transitions among different classes occur slowly. One may then be
able to use the different time scales to overcome the difficulty under the framework of
time-scale separation using a singular perturbation approach. However, the details on
this need to be thoroughly worked out; they are in fact out of the scope of the current
paper.

• It will be interesting to design admissible controls and find sufficient conditions for
stabilization of LQ systems with a hidden Markov chain in discrete time.

• In our setup, the process $X(t)$ represents the noisy observation, that is, a hidden Markov chain
observed in white noise. A class of controlled regime-switching diffusion systems
provides a somewhat more complex setup. In such a system, the dynamics are represented
by switching diffusions with a hidden Markov chain. The Markov chain is not directly
observable and can only be observed in another Gaussian white noise. That is, let us consider
the controlled system

\[
\begin{aligned}
dY(t) &= [A_{\alpha(t)}Y(t) + B_{\alpha(t)}U(t)]\,dt + \sigma_{\alpha(t)}\,dV(t), \\
dX(t) &= g_{\alpha(t)}\,dt + \rho(t)\,dW(t),
\end{aligned}
\tag{5.1}
\]

where $Y(t)$ and $X(t)$ are vector-valued processes with compatible dimensions
representing the state and observations, respectively, $V(t)$ and $W(t)$ are independent
multi-dimensional Brownian motions, and $\alpha(t)$ is the hidden Markov chain with a finite state
space. As was alluded to in the introduction, one of the motivations is Markowitz's
mean-variance portfolio selection [19]. One may then pose similar stabilization
problems.

• Recently, the use of regime-switching jump diffusions, which are switching diffusions with 
additional external jumps of a compound Poisson process, for modeling surplus in 
insurance risk has drawn much attention. A related problem in the adaptive setup is 
a regime-switching jump diffusion system in which the hidden Markov chain is observed 
in a manner similar to the observation in (5.1). One may then proceed with the study of 
stabilization problems. 

• In the study of stabilization, positive definiteness of certain matrices is used (see 
Lemma 4.2). A challenging problem is to investigate the stabilization problem with 
the positive definiteness removed for the system given by (5.1). Here, the crucial point 
seems to rely on recent developments in LQ problems with indefinite control weights 
[1]. One needs to use backward stochastic differential equations from the toolbox 
of stochastic analysis. 

All of these problems deserve further study and investigation. 
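To make the regime-switching diffusion model (5.1) more concrete, the following is a minimal simulation sketch; it is not part of the analysis in this paper. It assumes scalar processes, a two-state hidden chain, an Euler-Maruyama discretization, and a hypothetical linear feedback $U(t) = -kY(t)$ that ignores the filtering step. All names and numerical values (`A`, `B`, `SIGMA`, `G`, `RHO`, `Q`, `k`) are illustrative assumptions.

```python
import math
import random

# Hypothetical scalar coefficients indexed by the regime alpha in {0, 1}.
A = (0.5, -1.0)       # drift coefficients A_alpha
B = (1.0, 1.0)        # control coefficients B_alpha
SIGMA = (0.2, 0.4)    # state noise intensities sigma_alpha
G = (1.0, -1.0)       # observation drifts g_alpha
RHO = 0.5             # observation noise intensity (assumed constant here)
Q = ((-1.0, 1.0), (2.0, -2.0))  # assumed generator of the hidden chain


def simulate(T=5.0, dt=1e-3, k=2.0, y0=1.0, seed=0):
    """Euler-Maruyama path of (alpha, Y, X) under the feedback U = -k * Y."""
    rng = random.Random(seed)
    n = int(round(T / dt))
    alpha, y, x = 0, y0, 0.0
    alphas, ys, xs = [alpha], [y], [x]
    for _ in range(n):
        # The chain leaves its current state with probability ~ -Q[a][a] * dt.
        if rng.random() < -Q[alpha][alpha] * dt:
            alpha = 1 - alpha
        u = -k * y  # hypothetical feedback, linear in the state
        dv = rng.gauss(0.0, math.sqrt(dt))
        dw = rng.gauss(0.0, math.sqrt(dt))
        y += (A[alpha] * y + B[alpha] * u) * dt + SIGMA[alpha] * dv
        x += G[alpha] * dt + RHO * dw
        alphas.append(alpha)
        ys.append(y)
        xs.append(x)
    return alphas, ys, xs
```

In an adaptive scheme, the controller would only see the path `xs` and would have to replace the unknown regime by a filter estimate; the sketch generates the data on which such a scheme would operate.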

Appendix A. 



This appendix is devoted to the proof of Proposition 4.3. It is divided into several steps. 

Step 1. We already saw that the solution of (4.4) is given by 

\[
\Phi(t) = \Phi(0) + \int_0^t \Pi'\Phi(s)\,ds + N(t).
\]

Consequently, 

\[
N(t) = \Phi(t) - \Phi(0) - \int_0^t \Pi'\Phi(s)\,ds. \tag{A.1}
\]

In view of (A.1), the probabilistic interpretation of $\Phi(t)$ implies that $N(t)$ is a martingale 
bounded almost surely for each $t > 0$. We proceed to obtain the moment bounds of $N(t)$. 
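Step 2. As $\Pi$ is the generator of the irreducible Markov chain $\alpha(t)$, its unique stationary 
distribution $\nu$ satisfies $\Pi'\nu = 0$. Hence, it follows that 

\[
\int_0^t \Pi'\Phi(s)\,ds = \int_0^t \Pi'(\Phi(s) - \nu)\,ds.
\]

On the one hand, we clearly have from (4.5) 

\[
\mathbb{E}[|N(t)|^2] = \mathbb{E}\Big[\int_0^t |D(\Phi(s))C(X(s))'|^2\,ds\Big]. \tag{A.2}
\]

On the other hand, we deduce from (A.1) that 

\[
\begin{aligned}
\frac{1}{t}\mathbb{E}[|N(t)|^2]
&\le \frac{1}{t}\mathbb{E}\Big[\Big|\Phi(t) - \Phi(0) - \int_0^t \Pi'(\Phi(s) - \nu)\,ds\Big|^2\Big] \\
&\le \frac{2}{t}\mathbb{E}[|\Phi(t) - \Phi(0)|^2] + \frac{2}{t}\mathbb{E}\Big[\Big|\int_0^t \Pi'(\Phi(s) - \nu)\,ds\Big|^2\Big] \\
&\le \frac{2}{t} + \frac{2}{t}\mathbb{E}\Big[\int_0^t\!\!\int_0^t \mathrm{tr}\{\Pi\Pi'(\Phi(r) - \nu)(\Phi'(s) - \nu')\}\,dr\,ds\Big].
\end{aligned}
\tag{A.3}
\]

Consider the symmetric matrix 

\[
G(r,s) = (g_{ij}(r,s)) = \mathbb{E}[(\Phi(r) - \nu)(\Phi'(s) - \nu')].
\]

One can observe that 

\[
\begin{aligned}
g_{ij}(r,s)
&= \mathbb{E}\big[(\mathbb{E}[\mathbf{1}_{\{\alpha(r)=i\}} \mid \mathcal{F}_r] - \nu_i)(\mathbb{E}[\mathbf{1}_{\{\alpha(s)=j\}} \mid \mathcal{F}_s] - \nu_j)\big] \\
&= \mathbb{E}\big[\mathbb{E}[\mathbf{1}_{\{\alpha(r)=i\}} \mid \mathcal{F}_r]\,\mathbb{E}[\mathbf{1}_{\{\alpha(s)=j\}} \mid \mathcal{F}_s]\big]
 - \nu_j\,\mathbb{E}\big[\mathbb{E}[\mathbf{1}_{\{\alpha(r)=i\}} \mid \mathcal{F}_r]\big] \\
&\qquad - \nu_i\,\mathbb{E}\big[\mathbb{E}[\mathbf{1}_{\{\alpha(s)=j\}} \mid \mathcal{F}_s]\big] + \nu_i\nu_j \\
&= \mathbb{E}\big[\mathbb{E}[\mathbf{1}_{\{\alpha(r)=i\}} \mid \mathcal{F}_r]\,\mathbb{E}[\mathbf{1}_{\{\alpha(s)=j\}} \mid \mathcal{F}_s]\big]
 - \nu_j\,\mathbb{P}(\alpha(r)=i) \\
&\qquad - \nu_i\,\mathbb{P}(\alpha(s)=j) + \nu_i\nu_j.
\end{aligned}
\]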

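Note also by the Fubini Theorem that 

\[
\begin{aligned}
\frac{1}{t}\int_0^t\!\!\int_0^t g_{ij}(r,s)\,dr\,ds
&= \frac{1}{t}\Big[\iint_{0 \le r \le s \le t} g_{ij}(r,s)\,dr\,ds + \iint_{0 \le s \le r \le t} g_{ij}(r,s)\,dr\,ds\Big] \\
&= \frac{1}{t}\int_0^t\Big(\int_r^t g_{ij}(r,s)\,ds\Big)dr + \frac{1}{t}\int_0^t\Big(\int_s^t g_{ij}(r,s)\,dr\Big)ds \\
&= g_1(t) + g_2(t) = 2g_1(t).
\end{aligned}
\tag{A.4}
\]

We have the decomposition 

\[
g_1(t) = h_1(t) + \zeta_1(t),
\]

where 

\[
h_1(t) = \frac{1}{t}\int_0^t\Big(\int_r^t h(r,s)\,ds\Big)dr, \qquad
\zeta_1(t) = \frac{1}{t}\int_0^t\Big(\int_r^t \nu_i\big(\nu_j - \mathbb{P}(\alpha(s)=j)\big)\,ds\Big)dr,
\]

with 

\[
h(r,s) = \mathbb{P}(\alpha(r)=i)\,\mathbb{P}(\alpha(s)=j \mid \alpha(r)=i) - \nu_j\,\mathbb{P}(\alpha(r)=i).
\]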

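Before proceeding further, let us first note the following mixing properties regarding the 
Markov chain $\alpha(t)$. For all $t \ge 0$ and $s \le t$, denote 

\[
p(t) = (\mathbb{P}(\alpha(t)=1), \ldots, \mathbb{P}(\alpha(t)=m))' \in \mathbb{R}^m, \qquad
P(t,s) = \big(\mathbb{P}(\alpha(t)=j \mid \alpha(s)=i),\ i,j \in \mathcal{M}\big) \in \mathbb{R}^{m \times m},
\]

which are the probability vector and transition matrix of the Markov chain $\alpha(t)$, respectively. 
Since $\alpha(t)$ is irreducible, it is ergodic. Consequently, for the solution of 
the system 

\[
\frac{dp(t)}{dt} = \Pi' p(t), \qquad p(0) = p_0, \tag{A.5}
\]

satisfying $p_{0,i} \ge 0$ and $\sum_{i} p_{0,i} = 1$, one can find two positive constants $\kappa$ and $K$ 
such that $p(t) \to \nu$ as $t$ goes to infinity and 

\[
|p(t) - \nu| \le K\exp(-\kappa t). \tag{A.6}
\]

By virtue of (A.6), it is easily seen that 

\[
\begin{aligned}
|\zeta_1(t)|
&\le \frac{\nu_i}{t}\Big|\int_0^t\Big(\int_r^t \big(\nu_j - \mathbb{P}(\alpha(s)=j)\big)\,ds\Big)dr\Big| \\
&\le \frac{\nu_i}{t}\int_0^t\Big(\int_r^t |\nu_j - \mathbb{P}(\alpha(s)=j)|\,ds\Big)dr \\
&\le \frac{\nu_i K}{t}\int_0^t\Big(\int_r^t \exp(-\kappa s)\,ds\Big)dr \\
&\le \frac{\nu_i K}{\kappa t}\int_0^t \exp(-\kappa r)\,dr \\
&\le \frac{\nu_i K}{\kappa^2 t}.
\end{aligned}
\tag{A.7}
\]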
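For a two-state chain, the exponential convergence in (A.6) can be checked directly. The following minimal sketch assumes a generator of the form $\Pi = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}$ with hypothetical rates $a, b > 0$, for which $\nu = (b, a)'/(a+b)$ and $p(t) - \nu = e^{-(a+b)t}(p_0 - \nu)$, so that (A.6) holds with $\kappa = a + b$ and $K = |p_0 - \nu|$. The forward equation (A.5) is integrated with a classical Runge-Kutta scheme; the function names are illustrative only.

```python
import math


def stationary(a, b):
    """Stationary distribution of the generator [[-a, a], [b, -b]]."""
    return (b / (a + b), a / (a + b))


def rhs(p, a, b):
    """Right-hand side Pi' p of the forward equation (A.5)."""
    return (-a * p[0] + b * p[1], a * p[0] - b * p[1])


def solve_forward(p0, a, b, t, steps=2000):
    """Integrate dp/dt = Pi' p on [0, t] with the classical RK4 scheme."""
    h = t / steps
    p = tuple(p0)
    for _ in range(steps):
        k1 = rhs(p, a, b)
        k2 = rhs(tuple(p[i] + 0.5 * h * k1[i] for i in range(2)), a, b)
        k3 = rhs(tuple(p[i] + 0.5 * h * k2[i] for i in range(2)), a, b)
        k4 = rhs(tuple(p[i] + h * k3[i] for i in range(2)), a, b)
        p = tuple(
            p[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
            for i in range(2)
        )
    return p


def distance_to_stationary(p0, a, b, t):
    """Euclidean distance |p(t) - nu| for the numerically integrated p(t)."""
    nu = stationary(a, b)
    p = solve_forward(p0, a, b, t)
    return math.hypot(p[0] - nu[0], p[1] - nu[1])
```

Comparing `distance_to_stationary(p0, a, b, t)` with `|p0 - nu| * exp(-(a + b) * t)` for a few values of `t` illustrates the decay rate; for chains with more states the rate is governed by the spectral gap of $\Pi$ rather than by this closed form.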



Consequently, $\zeta_1(t)$ goes to zero as $t$ tends to infinity. Next, we shall show that $h_1(t)$ is 
bounded. As before, the solution of the system 

\[
\frac{dP(t,s)}{dt} = \Pi' P(t,s), \qquad P(s,s) = I, \tag{A.8}
\]

with $s \le t$, also satisfies, for two positive constants $\lambda$ and $K$, $P(t,s) \to \mathbf{1}\nu'$ and 

\[
|P(t,s) - \mathbf{1}\nu'| \le K\exp(-\lambda(t - s)). \tag{A.9}
\]
It follows from (A.9) that 

\[
\begin{aligned}
|h_1(t)|
&\le \frac{1}{t}\Big|\int_0^t \mathbb{P}(\alpha(r)=i)\Big(\int_r^t \big(\mathbb{P}(\alpha(s)=j \mid \alpha(r)=i) - \nu_j\big)\,ds\Big)dr\Big| \\
&\le \frac{1}{t}\int_0^t\Big(\int_r^t |\mathbb{P}(\alpha(s)=j \mid \alpha(r)=i) - \nu_j|\,ds\Big)dr \\
&\le \frac{K}{t}\int_0^t\Big(\int_r^t \exp(-\lambda(s - r))\,ds\Big)dr \\
&\le \frac{K}{\lambda}.
\end{aligned}
\]

Therefore, both $h_1(t)$ and $g_1(t)$ are bounded, which ensures that, for some positive 
constant $K$ independent of $t$, 

\[
\frac{1}{t}\mathbb{E}\Big[\int_0^t\!\!\int_0^t \mathrm{tr}\{\Pi\Pi'(\Phi(r) - \nu)(\Phi'(s) - \nu')\}\,dr\,ds\Big] \le K. \tag{A.10}
\]

Finally, (A.2) together with (A.3) and (A.10) implies (4.6), which completes the proof of 
Proposition 4.3. □ 
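The two double-integral estimates used in the proof above, namely the last steps of (A.7) and of the bound on $h_1(t)$, can be verified numerically. The sketch below is only an illustration under assumed values of $\kappa$ and $\lambda$: it evaluates the inner integrals in closed form, approximates the outer average by a midpoint rule, and allows a comparison with the bounds $1/(\kappa^2 t)$ and $1/\lambda$. The function names are hypothetical.

```python
import math


def averaged(inner, t, n=20000):
    """Midpoint-rule approximation of (1/t) * integral_0^t inner(r) dr."""
    h = t / n
    return sum(inner((k + 0.5) * h) for k in range(n)) * h / t


def zeta_integral(kappa, t):
    """(1/t) * int_0^t int_r^t exp(-kappa * s) ds dr, as in (A.7)."""
    # Inner integral in closed form: (exp(-kappa r) - exp(-kappa t)) / kappa.
    return averaged(
        lambda r: (math.exp(-kappa * r) - math.exp(-kappa * t)) / kappa, t
    )


def h_integral(lam, t):
    """(1/t) * int_0^t int_r^t exp(-lam * (s - r)) ds dr, as in the h_1 bound."""
    # Inner integral in closed form: (1 - exp(-lam (t - r))) / lam.
    return averaged(lambda r: (1.0 - math.exp(-lam * (t - r))) / lam, t)
```

The first quantity equals $(1 - e^{-\kappa t}(1 + \kappa t))/(\kappa^2 t)$ and therefore decays like $1/t$, whereas the second equals $\lambda^{-1}\big(1 - (1 - e^{-\lambda t})/(\lambda t)\big)$ and stays bounded by $1/\lambda$ without vanishing; this is why $\zeta_1(t)$ tends to zero while $h_1(t)$ is only shown to be bounded.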

Appendix B. 

We shall now focus on the proof of Proposition 4.5. First of all, we know that 

\[
\sup_{t \ge 0} |\Pi'\Phi(t)| \le 1 \quad \text{a.s.}
\]

In addition, we also have $|\Phi(t)| \le 1$ a.s. Consequently, it follows from (A.1) that 

\[
|N(t)| \le |\Phi(t) - \Phi(0)| + \Big|\int_0^t \Pi'\Phi(s)\,ds\Big| \le 1 + t \quad \text{a.s.} \tag{B.1}
\]

For each $i \in \mathcal{M}$, denote 

\[
N_i(t) = \int_0^t \sum_{j} [D(\Phi(s))C(X(s))']_{ij}\,dV_j(s),
\]


where $[D(\Phi(s))C(X(s))']_{ij}$ is the $ij$th entry of the matrix $D(\Phi(s))C(X(s))'$ and $V_j(s)$ stands for 
the $j$th component of $V(s)$. It follows from the well-known Doob's martingale inequality, 
given for example in [?, Theorem 1.7.4, p. 44], that for each $i \in \mathcal{M}$ and each positive 
integer $n$, 

\[
\mathbb{P}\Big(\sup_{0 \le t \le n}\Big[N_i(t) - \int_0^t |(D(\Phi(s))C(X(s))')_i|^2\,ds\Big] > \log n\Big) \le \frac{1}{n^2}, \tag{B.2}
\]

where $(D(\Phi(s))C(X(s))')_i$ denotes the row vector in the $i$th row of the matrix $D(\Phi(s))C(X(s))'$. 
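The bound in (B.2) has the shape of the exponential martingale inequality, which in its standard form states that, for a continuous local martingale $M$ with $M(0) = 0$ and positive constants $a$, $b$, $T$, one has $\mathbb{P}\big(\sup_{0 \le t \le T}[M(t) - \tfrac{a}{2}\langle M, M\rangle_t] > b\big) \le e^{-ab}$; the choice $a = 2$, $b = \log n$ gives the right-hand side $n^{-2}$. As a rough sanity check, and not as part of the proof, the sketch below estimates this probability by Monte Carlo in the simplest case where $M$ is a standard Brownian motion $W$, so that $\langle M, M\rangle_t = t$. The step size, horizon, and sample size are arbitrary assumptions.

```python
import math
import random


def exceedance_frequency(beta=1.0, T=4.0, dt=0.01, paths=1000, seed=0):
    """Fraction of discretized Brownian paths with max_k (W(k dt) - k dt) > beta.

    This approximates P(sup_{0 <= t <= T} [W(t) - t] > beta), which the
    exponential martingale inequality with a = 2 bounds by exp(-2 * beta).
    """
    rng = random.Random(seed)
    steps = int(round(T / dt))
    sd = math.sqrt(dt)
    hits = 0
    for _ in range(paths):
        w = 0.0
        for k in range(1, steps + 1):
            w += rng.gauss(0.0, sd)
            if w - k * dt > beta:
                hits += 1
                break
    return hits / paths
```

Because the supremum is taken over a time grid only, the empirical frequency tends to sit somewhat below the continuous-time probability, and hence below the bound $e^{-2\beta}$, up to sampling error.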

Hence, we deduce from the Borel-Cantelli Lemma that for almost all $\omega \in \Omega$, there is a 
$K_1 = K_1(\omega) > 1$ such that for all $n \ge K_1$ and $t \le n$, 

\[
\int_0^t |(D(\Phi(s))C(X(s))')_i|^2\,ds \le \log n + N_i(t) \le \log n + 1 + t \quad \text{a.s.} \tag{B.3}
\]

The last inequality above follows from (B.1). Dividing both sides of (B.3) by $t$, we obtain that for 
$n \ge K_2$ and $n - 1 \le t \le n$, 

\[
\begin{aligned}
\frac{1}{t}\int_0^t |(D(\Phi(s))C(X(s))')_i|^2\,ds
&\le \frac{1}{n-1}(\log n + 1 + t) \\
&\le \frac{1}{n-1}(\log n + 1 + n) \\
&\le K_3 \quad \text{a.s.}
\end{aligned}
\tag{B.4}
\]

and the bound $K_3$ is independent of $t$. Consequently, for some positive constant $K$ independent 
of $t$, the quadratic variation of the martingale is bounded by $Kt$ almost surely. That 
is, (B.4) implies that $\langle N, N \rangle_t \le Kt$ a.s. Finally, we deduce from the strong law of large 
numbers for local martingales [12] that 

\[
\lim_{t \to \infty} \frac{1}{t}N(t) = 0 \quad \text{a.s.,}
\]

which concludes the proof of Proposition 4.5. □ 
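The constant $K_3$ in (B.4) can be made explicit. Writing the middle line of (B.4) as $1 + (\log n + 2)/(n - 1)$ shows that it is decreasing in $n$, so for $n \ge 2$ it never exceeds its value $3 + \log 2$ at $n = 2$. The short sketch below checks this numerically; the restriction to $n \ge 2$ is an assumption made here so that $n - 1 \ge 1$, and the function names are illustrative.

```python
import math


def b4_bound(n):
    """Middle line of (B.4): (log n + 1 + n) / (n - 1), for integers n >= 2."""
    return (math.log(n) + 1 + n) / (n - 1)


def sup_b4_bound(n_min=2, n_max=10**4):
    """Largest value of b4_bound over the integers n_min, ..., n_max."""
    return max(b4_bound(n) for n in range(n_min, n_max + 1))
```

Under this assumption one may take $K_3 = 3 + \log 2$, and the bound approaches 1 as $n$ grows.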